230 research outputs found

    Using Dali for Protein Structure Comparison

    The exponential growth in the number of newly solved protein structures makes correlating and classifying the data an important task. Distance matrix alignment (Dali) is used routinely by crystallographers worldwide to screen the database of known structures for similarity to newly determined structures. Dali is easily accessible through the web server (http://ekhidna.biocenter.helsinki.fi/dali). Alternatively, the program may be downloaded and pairwise comparisons performed locally on Linux computers. © 2020, Springer Science+Business Media, LLC, part of Springer Nature. Peer reviewed.

    Dali server: structural unification of protein families

    Protein structure is key to understanding biological function. Structure comparison deciphers deep phylogenies, providing insight into functional conservation and functional shifts during evolution. Until recently, structural coverage of the protein universe was limited by the cost and labour involved in experimental structure determination. Recent breakthroughs in deep learning revolutionized structural bioinformatics by providing accurate structural models of numerous protein families for which no structural information existed. The Dali server for 3D protein structure comparison is widely used by crystallographers to relate new structures to pre-existing ones. Here, we report the two most recent upgrades to the web server: (i) the foldomes of key organisms in the AlphaFold Database (version 1) are searchable by Dali, and (ii) structural alignments are annotated with protein families. Using these new features, we discovered a novel, functionally diverse subgroup within the WRKY/GCM1 clan. This was accomplished by linking the structurally characterized SWI/SNF and NAM families, as well as the structural models of the CG-1 family and uncharacterized proteins, to the structure of Gti1/Pac2, a previously known member of the WRKY/GCM1 clan. The Dali server is available at http://ekhidna2.biocenter.helsinki.fi/dali. This website is free and open to all users and there is no login requirement. Peer reviewed.

    TOPAZ: asymmetric suffix array neighbourhood search for massive protein databases

    Protein homology search is an important, yet time-consuming, step in everything from protein annotation to metagenomics. Its application, however, has become increasingly challenging due to the exponential growth of protein databases. In order to perform homology search at the required scale, many methods have been proposed as alternatives to BLAST that make an explicit trade-off between sensitivity and speed. One such method, SANSparallel, uses a parallel implementation of the suffix array neighbourhood search (SANS) technique to achieve high speed and provides several modes to allow for greater sensitivity at the expense of performance. Peer reviewed.
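
The abstract names suffix array neighbourhood search without describing it. The sketch below is one simplified reading of the general idea, not the SANSparallel or TOPAZ implementation: every suffix of every database sequence is sorted, each query suffix is located in that sorted order, and the suffixes adjacent to it vote for the sequences they came from. The function names, the `window` parameter and the example sequences are all assumptions made for illustration, and a real tool would use a proper suffix array rather than a list of suffix strings.

```python
from bisect import bisect_left
from collections import Counter


def build_suffix_index(database):
    """Return a sorted list of (suffix, sequence_id) pairs.

    Toy stand-in for a suffix array: it stores the suffix strings
    themselves, which is only practical for very small inputs.
    """
    entries = []
    for seq_id, seq in database.items():
        for start in range(len(seq)):
            entries.append((seq[start:], seq_id))
    entries.sort()
    return entries


def neighbourhood_search(query, index, window=2):
    """Rank database sequences by votes from suffixes sorted near the query's suffixes."""
    votes = Counter()
    for start in range(len(query)):
        # Position where this query suffix would be inserted in sorted order.
        pos = bisect_left(index, (query[start:], ""))
        lo = max(0, pos - window)
        hi = min(len(index), pos + window)
        # Each neighbouring suffix casts one vote for its source sequence.
        for _, seq_id in index[lo:hi]:
            votes[seq_id] += 1
    return votes.most_common()


if __name__ == "__main__":
    # Hypothetical sequences, invented for this example.
    database = {
        "a": "MKVLAAGIVGLK",
        "b": "MKVLSAGIVGLR",
        "c": "PQRSTWYHHHHH",
    }
    index = build_suffix_index(database)
    print(neighbourhood_search("MKVLAAGIVGLK", index, window=1))
```

A larger `window` looks at more neighbours per query suffix, which is one way to picture the sensitivity versus speed trade-off the abstract mentions.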

    “PANNZER – a practical tool for protein function prediction”

    The facility of next-generation sequencing has led to an explosion of gene catalogs for novel genomes, transcriptomes and metagenomes, which are functionally uncharacterized. Computational inference has emerged as a necessary substitute for first-hand experimental evidence. PANNZER (Protein ANNotation with Z-scoRE) is a high-throughput functional annotation web server that stands out among similar publicly accessible web servers in supporting submission of up to 100,000 protein sequences at once and providing both Gene Ontology (GO) annotations and free text description predictions. Here, we demonstrate the use of PANNZER and discuss future plans and challenges. We present two case studies to illustrate problems related to data quality and method evaluation. Some commonly used evaluation metrics and evaluation datasets promote methods that favor unspecific and broad classes over more informative and specific classes. We argue that this can bias the development of automated function prediction methods. The PANNZER web server and source code are available at http://ekhidna2.biocenter.helsinki.fi/sanspanz/. This article is protected by copyright. All rights reserved. Peer reviewed.

    Dali server update

    The Dali server (http://ekhidna2.biocenter.helsinki.fi/dali) is a network service for comparing protein structures in 3D. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all-against-all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against UniProt. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo. Peer reviewed.

    Novel comparison of evaluation metrics for gene ontology classifiers reveals drastic performance differences

    Author summary: In the biosciences, predictive methods are becoming increasingly necessary as novel sequences are generated at an ever-increasing rate. The volume of sequence data necessitates Automated Function Prediction (AFP), as manual curation is often impossible. Unfortunately, selecting the best AFP method is complicated by researchers using different evaluation metrics. Furthermore, many commonly used metrics can give misleading results. We argue that the use of poor metrics in AFP evaluation is a result of the lack of methods to benchmark the metrics themselves. We propose an approach called Artificial Dilution Series (ADS). ADS uses existing data sets to generate multiple artificial AFP results, where each result has a controlled error rate. We use ADS to understand whether different metrics can distinguish between results with known quantities of error. Our results highlight dramatic differences in performance between evaluation metrics.

    Automated protein annotation using the Gene Ontology (GO) plays an important role in the biosciences. Evaluation has always been considered central to developing novel annotation methods, but little attention has been paid to the evaluation metrics themselves. Evaluation metrics define how well an annotation method performs and allow methods to be ranked against one another. Unfortunately, most of these metrics were adopted from the machine learning literature without establishing whether they were appropriate for GO annotations. We propose a novel approach for comparing GO evaluation metrics called Artificial Dilution Series (ADS). Our approach uses existing annotation data to generate a series of annotation sets with different levels of correctness (referred to as their signal level). We calculate the evaluation metric being tested for each annotation set in the series, allowing us to identify whether it can separate different signal levels. Finally, we contrast these results with several false positive annotation sets, which are designed to expose systematic weaknesses in GO assessment. We compared 37 evaluation metrics for GO annotation using ADS and identified drastic differences between metrics. We show that some metrics struggle to differentiate between different signal levels, while others give erroneously high scores to the false positive data sets. Based on our findings, we provide guidelines on which evaluation metrics perform well with the Gene Ontology and propose improvements to several well-known evaluation metrics. In general, we argue that evaluation metrics should be tested for their performance, and we provide software for this purpose. ADS is applicable to other areas of science where the evaluation of prediction results is non-trivial. Peer reviewed.
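
The dilution idea described above can be pictured with a small sketch. It is a toy under stated assumptions, not the authors' software: the abstract does not say how errors are introduced, so here each correct annotation is kept with probability equal to the signal level and otherwise swapped for a random term. The function names, the mean Jaccard metric and the GO-style identifiers are all hypothetical, and the GO hierarchy is ignored entirely.

```python
import random


def dilute(truth, noise_terms, signal, rng):
    """Copy `truth`, keeping each annotation with probability `signal`.

    Annotations that are not kept are replaced by a random term drawn
    from `noise_terms` (an assumed noise model, chosen for simplicity).
    """
    diluted = {}
    for protein, terms in truth.items():
        new_terms = set()
        for term in sorted(terms):  # sorted for reproducible draws
            if rng.random() < signal:
                new_terms.add(term)
            else:
                new_terms.add(rng.choice(noise_terms))
        diluted[protein] = new_terms
    return diluted


def mean_jaccard(truth, predicted):
    """Example metric under test: mean Jaccard overlap per protein."""
    scores = []
    for protein, terms in truth.items():
        pred = predicted.get(protein, set())
        union = terms | pred
        scores.append(len(terms & pred) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def dilution_series(truth, noise_terms, levels, metric, seed=0):
    """Score one diluted annotation set per signal level."""
    rng = random.Random(seed)
    return [(s, metric(truth, dilute(truth, noise_terms, s, rng))) for s in levels]


if __name__ == "__main__":
    # Made-up annotations and noise terms, for illustration only.
    truth = {
        "P1": {"GO:0000001", "GO:0000002"},
        "P2": {"GO:0000003"},
    }
    noise_terms = ["GO:0000010", "GO:0000011"]
    for signal, score in dilution_series(truth, noise_terms, [1.0, 0.5, 0.0], mean_jaccard):
        print(signal, score)
```

A metric that separates signal levels well should give clearly different scores along the series; a metric whose scores barely move as the signal drops would be flagged as uninformative.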

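    The experiential, pragmatic and reference-group-positioned relationship to wall decorations: the wall decoration worlds of 30 homes in a distinction-theoretical framework (original title: Kokemuksellinen, pragmaattinen ja viiteryhmään asemmoitu seinäkoristesuhde : 30:n kodin seinäkoristemaailmat distinktioteoreettisessa viitekehyksessä)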

    Only abstract. Paper copies of master’s theses are listed in the Helka database (http://www.helsinki.fi/helka). Electronic copies of master’s theses are either available as open access or only on thesis terminals in the Helsinki University Library.

    This thesis examines the living-room interiors of Finns aged 22 to 56, and in particular the wall decorations they display, within a distinction-theoretical framework. The study assumes that wall decorations serve as a means of presenting the taste and style of the person who displays them. From a sociological perspective, wall decoration thus functions as a means not only of decorating but also of distinction and identification. The theoretical framework relies largely on the distinction theory of the French sociologist Pierre Bourdieu: central to it are interpretations of taste in art and its significance in defining habitus and cultural competence. In addition, the thesis draws on other discussions relating to the home environment, relationships to objects, and consumption. Earlier research on wall decorations is limited to the doctoral dissertation of the Swedish art historian Eva Londos, published in 1993. Londos has made interpretations relating to distinction, among other things, and her study thus offers suggestions for structuring the world of wall decorations examined here: in this study too, wall decorations are classified by material, style and subject.

    The thesis is based on qualitative methods, in which both the theoretical framework and the empirical material play a central role. The material consists of interviews and photographs collected at the turn of 1996-1997 by the University of Art and Design Helsinki (Taideteollinen korkeakoulu). It comprises transcribed interviews from 30 homes together with photographs of each home. Wall decorations offer one set of tools for positioning oneself in society. Through wall decorations one can "make contact" with various reference groups, mark one's own position within them, and distinguish oneself from others. Wall pictures can be used to show one's own distinctiveness. All in all, wall decorations have their place in the struggle over defining legitimate taste and style.